Subsyllabic Tone Units for Reducing Physiological Effects in Automatic Tone Recognition for Connected Mandarin
Abstract
This paper presents our attempt to model the physiological transition effect on the syllable F0 contour in order to improve lexical tone recognition for Mandarin Chinese. We propose that a syllable F0 contour consists of three segments: an onset course, a tone nucleus, and an offset course. Of the three, only the tone nucleus carries the key features for tone recognition; the other two result from the physiological transition effect of the human vocal cords. Tone recognizer performance can therefore be improved by focusing only on tone nuclei and discarding the other two segments. The contour is divided into the three segments by our proposed segmentation method. Context-dependent tonal models, trained on tone nucleus features, are also introduced to model contextual tone coarticulation effects. The advantages of the proposed methods were demonstrated in tone recognition experiments on continuous Mandarin speech.
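As a rough illustration of restricting tone features to the tone nucleus, the Python sketch below computes two simple descriptors over the nucleus frames only, ignoring the onset and offset courses. It is not the segmentation or feature set used in the paper. The nucleus boundaries are assumed to be given, and the function name `nucleus_features` and the choice of mean log-F0 and linear slope as features are assumptions made purely for illustration.

```python
import math


def nucleus_features(f0_hz, start, end):
    """Return (mean log-F0, log-F0 slope per frame) over frames [start, end).

    f0_hz      : list of F0 values in Hz, one per voiced frame of a syllable
    start, end : hypothetical nucleus boundaries (frame indices), assumed to
                 come from some external segmentation step
    """
    if not 0 <= start < end <= len(f0_hz) or end - start < 2:
        raise ValueError("nucleus must span at least two frames inside the contour")

    # Keep only the nucleus; onset and offset frames are discarded.
    seg = [math.log(v) for v in f0_hz[start:end]]
    n = len(seg)

    # Ordinary least-squares line fit of log-F0 against frame index.
    t_mean = (n - 1) / 2
    y_mean = sum(seg) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(seg))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return y_mean, num / den


# Toy contour: rising onset course, level nucleus, falling offset course.
contour = [150.0, 170.0, 190.0, 200.0, 200.0, 200.0, 200.0, 180.0, 160.0]
mean_log_f0, slope = nucleus_features(contour, 3, 7)
```

In this toy contour the nucleus is level, so the fitted slope is zero even though the syllable as a whole rises and then falls. This is the kind of transition movement the abstract attributes to the onset and offset courses.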
Similar resources
Improved tone modeling for Mandarin broadcast news speech recognition
Tone has a crucial role in Mandarin speech in distinguishing ambiguous words. Most state-of-the-art Mandarin automatic speech recognition systems adopt embedded tone modeling, where tonal acoustic units are used and F0 features are appended to the spectral feature vector. In this paper, we combine the embedded approach (using improved F0 smoothing) with explicit tone modeling in rescoring the ou...
Incorporating tone-related MLP posteriors in the feature representation for Mandarin ASR
Tone has a crucial role in Mandarin speech in distinguishing ambiguous words. In most state-of-the-art Mandarin automatic speech recognition systems, tonal acoustic units are used and F0 features are appended to the spectral features (MFCC/PLP). However, a tone depends on the F0 contour of a time span much longer than a frame. Ideally, systems would compute the frame-level likelihood of a tone u...
Incorporating Pitch Features for Tone Modeling in Automatic Recognition of Mandarin Chinese
Tone plays a fundamental role in Mandarin Chinese, as it plays a lexical role in determining the meanings of words in spoken Mandarin. For example, these two sentences R R (I like horses) and R M (I like to scold) differ only in the tone carried by the last syllable. Thus, the inclusion of tone-related information through analysis of pitch data should improve the performance of automatic speech...
A Pitch Smoothing Method for Mandarin Tone Recognition
Mandarin Chinese is known as a tonal language with four lexical tones. Tone recognition plays an important role in automatic Chinese speech recognition in that the same syllable with different tones gives quite distinct meanings. The different tone can be characterized by its pitch contour, but the pitch contours are hardly ideal smooth curves. It is because the pitch points calculated by pitch...
Two-stream modeling of Mandarin tones
Tone modeling is a critical component for Mandarin large-vocabulary continuous-speech recognition systems. In previous work on pitch-feature extraction, we reported character error rate reductions of over 30% over the non-tonal baseline [1]. In this paper, we investigate how best to integrate tone modeling with a Mandarin LVCSR system. The paper focusses on the two-stream method, which is based ...